Smart Paper Notes

Scene Editing as Teleoperation: A Case Study in 6DoF Kit Assembly

Shubham Agrawal , Yulong Li , Jen-Shuo Liu , Steven K. Feiner , Shuran Song

Categories: Robotics | Artificial Intelligence | Computer Vision | Machine Learning

2021-10-09

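Research in robot teleoperation has centered on action specification, from continuous joint control to discrete end-effector pose control. However, these robot-centric interfaces often require skilled operators with extensive robotics expertise. To make teleoperation accessible to non-expert users, we propose the framework "Scene Editing as Teleoperation" (SEaT), whose key idea is to transform the traditional "robot-centric" interface into a "scene-centric" interface: instead of controlling the robot, users focus on specifying the task's goal by manipulating digital twins of the real-world objects. As a result, a user can perform teleoperation without any expert knowledge of the robot hardware. To achieve this, we use a category-agnostic scene-completion algorithm that translates the real-world workspace (with unknown objects) into a manipulable virtual scene representation, and an action-snapping algorithm that refines the user input before generating the robot's action plan. To train the algorithms, we procedurally generated a large-scale, diverse kit-assembly dataset containing object-kit pairs that mimic real-world object-kitting tasks. Our experiments in simulation and in the real world show that our framework improves both the efficiency and the success rate of 6DoF kit-assembly tasks. A user study shows that participants using the SEaT framework achieve a higher task success rate and report a lower subjective workload than with an alternative robot-centric interface. A video can be found at https://www.youtube.com/watch?v=-ndr3MKPBQQ.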

Translated by Google Translate

AFR-Net: Attention-Driven Fingerprint Recognition Network

Steven A. Grosz , Anil K. Jain

Categories: Computer Vision

2022-11-25

The use of vision transformers (ViT) in computer vision is increasing due to their limited inductive biases (e.g., locality, weight sharing, etc.) and increased scalability compared to other deep learning methods. This has led to some initial studies on the use of ViT for biometric recognition, including fingerprint recognition. In this work, we improve on these initial studies of transformers in fingerprint recognition by i) evaluating additional attention-based architectures, ii) scaling to larger and more diverse training and evaluation datasets, and iii) combining the complementary representations of attention-based and CNN-based embeddings for improved state-of-the-art (SOTA) fingerprint recognition (both authentication and identification). Our combined architecture, AFR-Net (Attention-Driven Fingerprint Recognition Network), outperforms several baseline transformer and CNN-based models, including a SOTA commercial fingerprint system, Verifinger v12.3, across intra-sensor, cross-sensor, and latent-to-rolled fingerprint matching datasets. Additionally, we propose a realignment strategy that uses local embeddings extracted from intermediate feature maps within the networks to refine the global embeddings in low-certainty situations, which significantly boosts the overall recognition accuracy of each of the models. This realignment strategy requires no additional training and can be applied as a wrapper to any existing deep learning network (attention-based, CNN-based, or both) to boost its performance.


The Robustness of Tether Friction in Non-idealized Terrains

Justin J. Page , Laura K. Treers , Steven Jens Jorgensen , Ronald S. Fearing , Hannah S. Stuart

Categories: Robotics

2022-08-22

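Reduced traction limits the ability of mobile robotic systems to resist or apply large external loads, such as when tugging a payload. One simple and versatile solution is to wrap a tether around naturally occurring objects to exploit the capstan effect and create exponentially amplified holding forces. Experiments show that an idealized capstan model explains the force amplification observed on common irregular outdoor objects (trees, rocks, posts). Robust to variable environmental conditions, this exponential amplification method can use single or multiple capstan objects, either in series or in parallel with a team of robots. This adaptability allows a range of potential configurations that are especially useful when objects cannot be fully encircled or gripped. These principles are demonstrated with mobile platforms to (1) control the lowering and arrest of a payload, (2) achieve planar control of a payload, and (3) act as an anchor point for a larger platform. We show that simply adding a tether, wrapped around shallow stones in sand, amplifies the holding force of a low-traction platform by up to 774x.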
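
The idealized capstan model mentioned above is the standard relation T_load = T_hold * exp(mu * theta), where mu is the tether-object friction coefficient and theta is the total wrap angle in radians. The sketch below only illustrates that textbook formula; the function names and the friction value are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def capstan_gain(mu, wrap_angle_rad):
    """Ideal capstan ratio T_load / T_hold = exp(mu * theta)."""
    return math.exp(mu * wrap_angle_rad)


def series_gain(wraps):
    """Gain of a tether wrapped around several objects in series.

    `wraps` is a list of (mu, wrap_angle_rad) pairs; in the ideal model
    the individual gains multiply.
    """
    total = 1.0
    for mu, angle in wraps:
        total *= capstan_gain(mu, angle)
    return total


def wrap_angle_for_gain(mu, gain):
    """Wrap angle (radians) the ideal model needs to reach a given gain."""
    return math.log(gain) / mu


if __name__ == "__main__":
    mu = 0.5  # assumed friction coefficient, for illustration only
    print("gain for one full turn:", capstan_gain(mu, 2 * math.pi))
    turns = wrap_angle_for_gain(mu, 774.0) / (2 * math.pi)
    print("turns needed for a 774x gain at this mu:", turns)
```

Because the gain is exponential in mu * theta, modest changes in wrap angle or surface friction change the holding force by large factors, which is why partial wraps around irregular objects can still be useful.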


PrintsGAN: Synthetic Fingerprint Generator

Joshua J. Engelsma , Steven A. Grosz , Anil K. Jain

Categories: Computer Vision

2022-01-10

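A major impediment to researchers working in fingerprint recognition is the lack of publicly available, large-scale fingerprint datasets. The public datasets that do exist contain few identities and few impressions per finger. This limits research on a number of topics, including, for example, using deep networks to learn fixed-length fingerprint embeddings. We therefore propose PrintsGAN, a synthetic fingerprint generator capable of generating unique fingerprints along with multiple impressions of a given fingerprint. Using PrintsGAN, we synthesize a database of 525,000 fingerprints (35,000 distinct fingers, each with 15 impressions). Next, we show the utility of the PrintsGAN-generated dataset by training a deep network to extract a fixed-length embedding from a fingerprint. In particular, an embedding model trained on our synthetic fingerprints and fine-tuned on 25,000 prints from NIST SD302 obtains a TAR of 87.03% on the NIST SD4 database (a boost from TAR = 73.37% when trained only on NIST SD302). Prevailing synthetic fingerprint generation methods do not enable such gains because of i) lack of realism or ii) inability to generate multiple impressions. We plan to release our database of synthetic fingerprints to the public.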


Pattern-Aware Data Augmentation for LiDAR 3D Object Detection

Jordan S. K. Hu , Steven L. Waslander

Categories: Computer Vision

2021-11-30

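Autonomous driving datasets are often skewed and, in particular, lack training data for objects far from the ego vehicle. This data imbalance causes performance to degrade as the distance of the detected objects increases. In this paper, we propose pattern-aware ground truth sampling, a data augmentation technique that downsamples an object's point cloud based on the LiDAR's characteristics. Specifically, we mimic the natural diverging point pattern variation that occurs for objects at depth in order to simulate samples at farther distances. The network thus sees more diverse training examples and generalizes more effectively to detecting farther objects. We evaluate against existing data augmentation techniques that use point removal or perturbation and find that our method outperforms all of them. Additionally, we propose using equal-element AP bins to evaluate the performance of 3D object detectors across distance. We improve the performance of PV-RCNN on the car class on the KITTI validation split at distances greater than 25 m.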
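
The abstract does not spell out how the equal-element AP bins are built. One plausible reading, sketched below as an assumption, is that distance bin edges are placed at quantiles so that every bin holds roughly the same number of ground-truth objects, instead of using fixed-width distance ranges. The function names are hypothetical.

```python
import numpy as np


def equal_element_bin_edges(distances, n_bins):
    """Bin edges at quantiles, so each bin holds about the same number of objects."""
    distances = np.asarray(distances, dtype=float)
    return np.quantile(distances, np.linspace(0.0, 1.0, n_bins + 1))


def assign_bins(distances, edges):
    """Index of the bin each distance falls into (0 .. n_bins - 1).

    Only the interior edges are searched, so the minimum lands in bin 0
    and the maximum lands in the last bin.
    """
    distances = np.asarray(distances, dtype=float)
    return np.searchsorted(edges[1:-1], distances, side="right")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Skewed toy distances: many near objects, few far ones.
    d = rng.exponential(scale=20.0, size=3000)
    edges = equal_element_bin_edges(d, 3)
    counts = np.bincount(assign_bins(d, edges), minlength=3)
    print("edges:", edges)
    print("objects per bin:", counts)
```

With fixed-width bins the far-range bin of a skewed dataset would contain very few objects, making its AP noisy; quantile edges keep the per-bin sample size comparable.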


Constructing High-Order Signed Distance Maps from Computed Tomography Data with Application to Bone Morphometry

Bryce A. Besler , Tannis D. Kemp , Nils D. Forkert , Steven K. Boyd

Categories: Computer Vision

2021-11-02

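An algorithm is presented for constructing high-order signed distance fields for two-phase materials imaged with computed tomography. The signed distance field is high-order in that it is free of the quantization artifact associated with the distance transform of sampled signals. The narrowband is solved using a closest point algorithm extended to implicit embeddings that are not signed distance fields. A high-order fast sweeping algorithm is used to extend the narrowband to the remainder of the domain. The order of accuracy of the narrowband and extension methods is verified on ideal implicit surfaces. The method is applied to ten excised cubes of bovine trabecular bone. Localization of the surface, estimation of phase densities, and local morphometry are validated with these subjects. Because the embedding is high-order, gradients, and therefore curvatures, can be accurately estimated locally in the image data.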
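
The link between gradients and curvature in the last sentence can be made explicit with the standard level-set identities. These are textbook relations, not formulas quoted from the paper, and the sign and scaling conventions shown are an assumption (here $\kappa$ is the sum of the principal curvatures):

```latex
\mathbf{n} = \frac{\nabla\phi}{\lVert\nabla\phi\rVert},
\qquad
\kappa = \nabla\cdot\mathbf{n},
\qquad
\lVert\nabla\phi\rVert = 1 \;\Longrightarrow\; \kappa = \nabla^{2}\phi .
```

Since $\kappa$ involves second derivatives of the embedding $\phi$, any staircase-like quantization error in $\phi$ is amplified when differentiating, which is why a high-order signed distance field matters for local curvature estimates.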


C2CL: Contact to Contactless Fingerprint Matching

Steven A. Grosz , Joshua J. Engelsma , Eryun Liu , Anil K. Jain

Categories: Computer Vision | Machine Learning

2021-04-06

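Matching contactless fingerprints or finger photos to contact-based fingerprint impressions has received increased attention in the wake of COVID-19, owing to the superior hygiene of contactless acquisition and the widespread availability of low-cost mobile phones capable of capturing fingerprint photos with sufficient resolution for verification purposes. This paper presents an end-to-end automated system, called C2CL, comprising a mobile finger photo capture app, preprocessing, and matching algorithms, to handle the challenges that have inhibited previous cross-matching methods, namely i) low ridge-valley contrast of contactless fingerprints, ii) varying roll, pitch, yaw, and distance of the finger to the camera, iii) non-linear distortion of contact-based fingerprints, and iv) differing image quality across smartphone cameras. Our preprocessing algorithm segments, enhances, scales, and unwarps contactless fingerprints, while our matching algorithm extracts both minutiae and texture representations. A dataset of contactless 2D fingerprints and corresponding contact-based fingerprints from 206 subjects (2 thumbs and 2 index fingers per subject), acquired using our mobile capture app, is used to evaluate the cross-database performance of the proposed algorithm. Furthermore, additional experimental results on 3 public datasets demonstrate a substantial improvement in the state of the art for contact to contactless fingerprint matching (TAR in the range of 96.67% to 98.30% at FAR = 0.01%).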
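
The accuracy figures above are reported as a true accept rate (TAR) at a fixed false accept rate (FAR). The sketch below shows one common way to compute that metric from similarity scores. It is a generic illustration, not the paper's evaluation code; it assumes that higher scores mean greater similarity, and the function name and toy scores are made up.

```python
import numpy as np


def tar_at_far(genuine, impostor, far):
    """Return (TAR, threshold) with the empirical FAR at most `far`.

    A comparison is accepted when its score is strictly above the threshold.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor_desc = np.sort(np.asarray(impostor, dtype=float))[::-1]
    # Number of impostor comparisons that may be (wrongly) accepted.
    k = int(np.floor(far * impostor_desc.size + 1e-9))
    if k >= impostor_desc.size:
        threshold = -np.inf  # every comparison is accepted
    else:
        # At most k impostor scores lie strictly above the (k+1)-th largest.
        threshold = impostor_desc[k]
    tar = float(np.mean(genuine > threshold))
    return tar, threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    genuine_scores = rng.normal(0.8, 0.1, size=2000)     # toy mated pairs
    impostor_scores = rng.normal(0.2, 0.1, size=200000)  # toy non-mated pairs
    tar, thr = tar_at_far(genuine_scores, impostor_scores, far=1e-4)
    print(f"TAR = {tar:.4f} at FAR = 0.01% (threshold = {thr:.3f})")
```

At FAR = 0.01% only one impostor comparison in 10,000 may be accepted, so a meaningful estimate needs a large number of impostor scores.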


Occupant Plugload Management for Demand Response in Commercial Buildings: Field Experimentation and Statistical Characterization

Chaitanya Poolla , Abraham K. Ishihara , Dan Liddell , Rodney Martin , Steven Rosenberg

Categories: Machine Learning | Machine Learning (Statistics)

2020-04-14

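Commercial buildings account for approximately 35% of total US electricity consumption, nearly two-thirds of which is met by fossil fuels, with an adverse impact on the environment. This impact can be mitigated by lowering energy consumption through control of occupant plugload usage in a closed-loop building environment. In this work, we conducted multiple experiments to analyze changes in occupant plugload energy consumption due to monetary incentives and/or visual feedback. The incentives were daily monetary values between $5 and $50, administered in random order, and the visual feedback consisted of a web-based dashboard designed to raise participants' energy awareness. The experiments were performed in government office and university buildings at the NASA Ames Research Park in Moffett Field, California. Autoregressive models were constructed to predict expected plugload savings in the presence of exogenous variables. Analysis of the data revealed that plugload energy consumption can be modulated through visual feedback and incentive mechanisms, suggesting that occupant-in-the-loop control architectures may be effective in commercial building environments. Our findings indicate that the mean energy reductions due to visual feedback in the office and university environments were approximately 9.52% and 21.61%, respectively. When visual feedback in the university environment was augmented with monetary incentives, the mean energy reduction was approximately 24.22%.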
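
The abstract does not give the structure of its autoregressive models. As a minimal sketch of the general idea, the code below fits an autoregressive model with one exogenous input (often called ARX) by ordinary least squares. The model order, the use of a 0/1 "feedback active" indicator as the exogenous variable, and all names and numbers are assumptions for illustration only.

```python
import numpy as np


def simulate(n, c, a, b, x, y0, noise_std=0.0, seed=0):
    """Generate y[t] = c + a * y[t-1] + b * x[t] + noise (toy data)."""
    rng = np.random.default_rng(seed)
    y = np.empty(n)
    y[0] = y0
    for t in range(1, n):
        y[t] = c + a * y[t - 1] + b * x[t] + noise_std * rng.normal()
    return y


def fit_arx(y, x, p=1):
    """Least-squares fit of y[t] = c + sum_i a_i * y[t-i] + b * x[t].

    Returns the coefficient vector [c, a_1, ..., a_p, b].
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    rows = [
        np.concatenate(([1.0], y[t - p:t][::-1], [x[t]]))
        for t in range(p, len(y))
    ]
    coef, *_ = np.linalg.lstsq(np.vstack(rows), y[p:], rcond=None)
    return coef


if __name__ == "__main__":
    n = 400
    x = (np.arange(n) >= n // 2).astype(float)  # feedback switched on halfway
    y = simulate(n, c=2.0, a=0.8, b=-0.5, x=x, y0=5.0, noise_std=0.1)
    c_hat, a_hat, b_hat = fit_arx(y, x, p=1)
    # For this first-order model the long-run effect of x is b / (1 - a).
    print("estimated long-run change in load:", b_hat / (1.0 - a_hat))
```

In this toy setup a negative coefficient on the exogenous input corresponds to a predicted saving while the intervention is active.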


Computing the Performance of A New Adaptive Sampling Algorithm Based on The Gittins Index in Experiments with Exponential Rewards

James K. He , Sofía S. Villar , Lida Mavrogonatou

Categories: Machine Learning

2023-01-03

Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can attain both optimality and computational efficiency, and it has recently been used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI-modified design shows operating characteristics that are comparable in learning (e.g., statistical power) but substantially better in earning (e.g., direct benefits). This illustrates the potential of designs that use a GI approach to allocate participants to improve participant benefits, increase efficiency, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.


MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding

Steven H. Wang , Antoine Scardigli , Leonard Tang , Wei Chen , Dimitry Levkin , Anya Chen , Spencer Ball , Thomas Woodside , Oliver Zhang , Dan Hendrycks

Categories: Natural Language Processing

2023-01-02

Reading comprehension of legal text can be a particularly challenging task due to the length and complexity of legal clauses and a shortage of expert-annotated datasets. To address this challenge, we introduce the Merger Agreement Understanding Dataset (MAUD), an expert-annotated reading comprehension dataset based on the American Bar Association's 2021 Public Target Deal Points Study, with over 39,000 examples and over 47,000 total annotations. Our fine-tuned Transformer baselines show promising results, with models performing well above random on most questions. However, on a large subset of questions, there is still room for significant improvement. As the only expert-annotated merger agreement dataset, MAUD is valuable as a benchmark for both the legal profession and the NLP community.
